
De-identification of medical records using conditional random fields and long short-term memory networks

Abstract

The CEGS N-GRID 2016 Shared Task 1 in Clinical Natural Language Processing focuses on the de-identification of psychiatric evaluation records. This paper describes two participating systems of our team, based on conditional random fields (CRFs) and long short-term memory networks (LSTMs). A pre-processing module was introduced for sentence detection and tokenization before de-identification. For CRFs, manually extracted rich features were utilized to train the model. For LSTMs, a character-level bi-directional LSTM network was applied to represent tokens and classify tags for each token, following which a decoding layer was stacked to decode the most probable protected health information (PHI) terms. The LSTM-based system attained an i2b2 strict micro-F1 measure of 89.86%, which was higher than that of the CRF-based system.
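
As an illustration of the reported metric, the following is a minimal sketch of a strict micro-averaged F1 over PHI spans. It assumes that "strict" means a predicted span counts as correct only when its document, character offsets, and PHI type all match a gold span exactly; the official i2b2 evaluation script may differ in details. The `Span` tuple and the function name `strict_micro_f1` are hypothetical names chosen for this example and do not come from the paper.

```python
from collections import namedtuple

# Hypothetical span representation (an assumption for this sketch):
# document id, character start/end offsets, and PHI category.
Span = namedtuple("Span", ["doc_id", "start", "end", "phi_type"])


def strict_micro_f1(gold, predicted):
    """Return (precision, recall, f1) pooled over all spans.

    A prediction is a true positive only if an identical gold span exists
    (same document, same offsets, same PHI type). Counts are pooled across
    all documents and types before computing the scores (micro-averaging).
    """
    gold_set = set(gold)
    pred_set = set(predicted)
    true_positives = len(gold_set & pred_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    gold = [
        Span("note1", 0, 10, "NAME"),
        Span("note1", 25, 35, "DATE"),
        Span("note2", 5, 12, "AGE"),
    ]
    predicted = [
        Span("note1", 0, 10, "NAME"),   # exact match
        Span("note1", 25, 34, "DATE"),  # boundary off by one: not counted
        Span("note2", 5, 12, "AGE"),    # exact match
    ]
    print(strict_micro_f1(gold, predicted))
```

In this toy example two of the three predictions match exactly, so precision, recall, and F1 are all 2/3. The span with a one-character boundary error receives no credit, which is what distinguishes strict matching from more relaxed overlap-based matching.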
